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Abstract. In the past several years high resolution kinematic data sets from Milky 
Way satellite galaxies have confirmed earlier indications that these systems are dark 
matter dominated objects. Further understanding of what these galaxies reveal about 
cosmology and the small scale structure of dark matter relies in large part on a more 
detailed interpretation of their internal kinematics. This article discusses a likelihood 
formalism that extracts important quantities from the kinematic data, including the 
amplitude of rotation, proper motion, and the mass distribution. In the simplest model 
the projected error on the rotational amplitude is shown to be ~ 0.5 km s _1 with ~ 10 3 
stars from either classical or ultra-faint satellites. The galaxy Sculptor is analyzed for 
the presence of a rotational signal; no significant detection of rotation is found, and 
given this result limits are derived on the Sculptor proper motion. A criteria for model 
selection is discussed that determines the parameters required to describe the dark 
matter halo density profiles and the stellar velocity anisotropy. Applied to four data 
sets with a wide range of velocities, the likelihood is found to be more sensitive to 
variations in the slope of the dark matter density profile than variations in the velocity 
anisotropy. Models with variable radial velocity anisotropy are shown to be preferred 
relative to those in which this quantity is constant at all radii in the galaxy. 
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1. Introduction 

Since their initial discovery p], dwarf spheroidals (dSphs) have offered a unique insight 
into the formation of galaxies and structure on the smallest scales. Initially characterized 
as unusual and ghostly stellar systems, photometric studies tended to find that these 
systems contained old stellar populations with no recent signature of star formation 
activity [2]. Though photometrically well-studied since their discovery over seventy 
years ago, as late as nearly 30 years ago minimal was known on the internal kinematic 
properties of their stellar populations or on the kinematic properties of these objects in 
the Milky Way (MW) halo. 

Aaronson [3] provided the first measurement of the line-of-sight velocities of stars in 
Milky Way dSphs. From the spectra of merely three carbon stars, Aaronson suggested 
a mass-to-light ratio for the Draco dSph nearly an order of magnitude greater than that 
of Galactic globular clusters. Follow-up studies of several dSphs, including Sextans, 
Fornax, Ursa Minor, Sculptor, increased the velocity samples by an order of magnitude, 
and in the process established these systems to be dark matter dominated [U EJ [6]. 
It was further suggested that all of these systems share a similar dark matter halo 
mass of ~ [1 — 5] x 10 7 M [6]. Even at the time of these early measurements, it was 
understood that the mass distributions of these systems provide strong constraints on 
the properties of the particle nature of dark matter, including its mass and primordial 
phase-space density [7J [H [9] . 

With the advent of high resolution, multi-object spectroscopy, the velocity samples 
from the brightest dSphs initially studied in Refs. [U G2 [6] have now increased by up 
to three orders of magnitude [TOl [TTJ [12]. These new data sets have revealed that the 
velocity dispersions of the systems are all ~ 10 km s -1 , and in all cases the dispersions 
remain constant even out to the projected radius of the outermost velocity measurements 
Though the data sets have increased by more than ten- fold, the more modern 
analysis of these systems still confirms the global conclusion established from the initial 
observations that dSphs are strongly dark matter dominated [TUl HH [14] . 

Not only has the past several years seen an increase in the kinematic data sets for 
the brightest dSphs, the number of known Milky Way satellites has more than doubled 
due to the Sloan Digital Sky Survey (SDSS). As of the writing of this article, the SDSS 
has discovered 14 new Galactic satellites [T5], [161 El- The new SDSS systems have 
lower luminosities and surface brightnesses than the 11 classical Milky Way satellites 
that were known prior to SDSS. The half-light radius for several of these new objects 
is less than ~ 100 pc; this radius is smaller than the typical half-light radius of the 
classical satellites but still somewhat larger than the typical globular cluster half-light 
radius of ~ 1 — 10 pc. 

Several kinematics studies on the ultra-faint population of SDSS satellites have 
been undertaken in the past several years [T8J HH [20j [21]. Using spectra from eight 
of the SDSS satellites, Simon and Geha [20] concluded that these objects are strongly 
dark matter dominated. Several of the ultra-faint satellites have velocity dispersions 
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as low as ~ 5 km s _1 , making them the most-promising systems to study the phase 
space limits of the dark matter. It has additionally been observed that the ultra-faint 
satellites are the most metal-poor systems known, and that they form a continuation of 
the luminosity metallicity trend set by the brightest dSphs [221 12T] - 

With the above data sets now available, it is becoming increasingly necessary to 
develop better theoretical tools to interpret them. An important aspect of the theoretical 
modeling will necessarily require an interpretation of the kinematic data sets for the 
population of MW satellites; a detailed understanding of these kinematic data sets 
will be important not only for determining the mass distributions of each individual 
system, but for a global comparison to theories of Cold Dark Matter (CDM) [231 IT3] . 
Understanding the mass distributions will also be important for interpretation of limits 
on particle dark matter masses and annihilation cross sections in high-energy gamma- 
ray experiments [211 1251 [26] . Further, understanding the kinematics of these systems 
may eventually reveal whether they have dark matter cusps or cores, which would in 
itself provide a stringent test of the CDM paradigm [27] . 

The primary aim of this article is to discuss a maximum likelihood formalism that 
is used for extracting important physical quantities from dSph kinematic data sets. 
Section [2] begins by reviewing the properties of the kinematic data sets and defining the 
likelihood. Section [3] then uses the likelihood to extract rotational and proper motion 
signals. Section H] discusses mass modeling and a new calculation for model selection. 
Section [5] presents the conclusions. 



2. Likelihood Function and Error Modeling 

Information on the kinematic properties of dSphs is extracted from the line-of-sight 
velocities of their individual stars. This section introduces the likelihood used in the 
data analysis and projections for the errors attainable on several parameters using the 
likelihood. 



2.1. Likelihood Function 



The probability for a velocity data set, v, 

1 



p(v\u, aios) = Yl 



is assumed to be of the form 
K - u) 2 



cxp 



2 (<*L + Olos) 



(1) 



In Eq. [1] the dispersion of the distribution is given by the sum of the measurement 
uncertainty on a star, o~ m , and the intrinsic dispersion of the system at the projected 
radius of the star. The latter quantity is symbolized by o~i os and is determined by the 
model; Section H] below provides more details on this quantity and specifically how it 
relates to the mass of the systems. The systemic line-of-sight velocity in the direction of 
the i th star is given by u. Written in the above form, Eq. [T]may be read as the probability 
for the data set, given the parameters u and ai os . Appealing to Bayes' Theorem and 
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defining the likelihood function as 



C(u,cr los ) = p(u,a los \v) 



(2) 



the parameters u and ai os may be determined directly from the data by the maximization 
of Eq. [2j Equation [2] assumes uniform priors on the model parameters. 

The form of Eq. [I] results from the convolution of Gaussian distribution which 
represents the measurement error on the velocity of a given star with a separate sampling 
distribution that is assumed to be Gaussian. It is the sampling distribution of velocities 
that is connected to physical quantities such as the velocity anisotropy of the stars, 
and the potential of the stellar and dark matter components. For a given model of 
the galaxy, the true line-of-sight velocity distribution function may indeed be non- 
aussian; certain limiting cases of the velocity distribution for analytic potentials have 
been considered in Ref. [28]. This paper shows that when attempting to reconstruct 
the line-of-sight velocity distribution for a given model, degeneracies exist between the 
stellar velocity anisotropy and the stellar and dark matter potentials. Though more 
information may be gained on model parameters if the true velocity distribution were 
known, and thus utilized in the parameter estimation, the Gaussian approximation 
provides the most conservative sampling distribution in reconstructing model parameters 
in variance estimation problems (for a specific discussion of this point, see the discussion 
in Chapter 8 of Ref. [29J). Further, the mass estimations presented here using the 
likelihood in Eq. [2] agree with mass estimates that use a Gaussian likelihood in the 
binned velocity dispersion [3(1 [31]; in this latter case the velocity dispersion does not 
necessarily correspond to the variance of a Gaussian line-of-sight velocity distribution, 
making it self-consistent to determine parameters such as the velocity anisotropy. 

The distribution function in Eq. [1] provides the simplest description of a data set. 
Including higher-order effects naturally introduces a larger set of model parameters. 
The first modification to Eq. [I] from higher order corrections comes from noting that 
the mean velocity, u, varies as a function of the position of the star in the galaxy. This 
variation in the mean velocity results from the fact that, for lines-of-sight with larger 
angles from the line-of-sight directly to the center of the galaxy, the proper motion of 
the object contributes an increasingly larger component to the line-of-sight velocity. 
To describe how the line-of-sight velocity varies as a function of position, consider a 
cartesian coordinate system in which the z-axis points in the direction of the observer 
from the center of the galaxy, the x-axis points in the direction of decreasing right 
ascension, and the y-axis points in the direction of increasing declination. The angle <j) 
is measured counter-clockwise from the positive x-axis, and p is the angular separation 
from the center of the galaxy. The mean line-of-sight velocity is then 



In the small angle approximation, sin p ~ R/D, where R = \/x 2 + y' 2 , and D is the 
distance from the observer to the center of the dSph. Then sin0 = y/R, so that Eq. [3] 
can be written as u = v x x/D + v y y/D — v z . In the limit that the vector pointing from 



u = v x sin p cos <p + v y sin p sin — v z cos p. 



(3) 
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the observer to the center of the galaxy is exactly parallel to the lines-of-sight to each 
star, u ~ —v z . 

Equation [3] show that the line-of-sight velocity of a system increases roughly linearly 
with the increase of the projected distance from the center of the dSph. This effect is 
purely geometric and may be used to recover the proper motion of a dSph with similar 
accuracy to the proper motions attained in ground and space-based measurements 
[32| [T2] ; an application to a specific data set of Sculptor is given below. The extraction 
of dSph proper motions in this manner is analogous to the determination of the proper 
motions for the Large Magellanic Clould |33j and for M31 [34j from their stellar and 
satellite distributions, respectively. 

There may also be rotational motion, in addition to the dominant contribution 
from random motions, present in the galaxy. Though rotation is intrinsic to the 
dynamics of the system and is not purely geometric as that described by Eq. EJ a 
simple parameterization is possible if the rotation amplitude is described by a term 
Asin{(j) — <po), where 0o defines the projected axis of rotation. Adding all of the terms 
together gives the following expression for the line-of-sight velocity of a star, 

u = v x sin p cos 4> + v y sin p sin — v z cos p + A sin(0 — (fi ) . (4) 

With the addition of each of the terms in Eq. HJ our likelihood function now reads 

£(v x ,Vy,v z ,A,(f) ,cr los ) =p(v x ,v y ,v z ,A,(f) ,cri os \v), (5) 

and the vector set of 6 parameters (v x , v y , v z , A, 4> , ai os ) may be directly determined 
from the data. In the sections below these parameters are determined from an example 
data set; before jumping into this data analysis the following sub-section provides a 
discussion of the theoretical predictions for the errors attainable on these quantities. 



2.2. Error Modeling 

From the likelihood function defined in Eqs. [T] and HU the Fisher matrix formalism may 
be used to derive projected errors on the model parameters. For m model parameters 
that are varied, the Fisher matrix is defined as an m by m matrix so that the entry for 
the a th and b th parameters is given by 

/d 2 \nC\ 



ab 



de a de b 



(6) 



Here 9 is a vector defining the set of parameters. In the simple case studied in this 
section, the parameters are given by 6 = {v x , v y , v z , A, (fi , o- los }. According to the Rao- 
Cramer inequality, the minimum possible variance attainable on a parameter using 
m axim um likelihood statistics is given by the inverse of the Fisher information matrix, 
\J^aa- The average in Eq. [6] is taken over the data, and the derivatives are evaluated 
at the true model of parameter space. The inverse of the Fisher matrix thus provides 
an approximation for the true covariance of the parameters, and using F _1 provides a 
good approximation to the errors on parameters that are well-constrained by the data. 
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The Fisher matrix is constructed by differentiating the log of the likelihood function 
in Eq. O It will be understood that the total dispersion of = af os + o 2 m is evaluated at 
the projected radius of the i th star. Averaging over the likelihood function, and using 
the above definition of u, the final expression for the Fisher matrix is 

1 du t du % + 1 1 dal s da? os \ ^ 



ab 



afd9 a d9 b 2 a? 06 a 06 b , 

The sum is over the N number of observed stars in the galaxy. The analysis in this 
section considers the simplified case that af os does not in itself depend on any model 
parameters. A more detailed model would consider this quantity as a function of the 
parameters that describe the mass modeling of the system; this is discussed in more 
detail in Section @] below. 

In the second term in Eq. UJ, the derivatives are with respect to the theory 
dispersion alone, whereas both of the contributions to the variance sum in the 
denominator. For the well-studied satellites, with intrinsic velocity dispersions of 10 
km s _1 , the dispersion from the distribution function dominates the dispersion from 
the measurement uncertainty, while for many of the newly-discovered satellites, both 
contributions to the dispersions are similar. Equation [7J shows that, to determine the 
error on any of the 9 parameters, one must determine 1) the distribution of stars within 
the dSph that have measured velocities, and 2) the error on the velocity of each star. 
This implies that the projected errors are independent of the mean velocity of the stars. 
Additionally, under the approximation that sin p <C 1 and no rotation, the first term in 
Eq. [7J vanishes, and the errors are independent of the parameters describing the mean 
motion of the system. 

The projected errors obtained using Eq. [7J provide an excellent estimate of the 
measured errors on both v x and v y [32j IT2] . Though there has been no conclusive 
detection of a parameter similar to A in published kinematic data samples, it is 
interesting to determine the expected error on this quantity given expected future 
data samples. Figure [I] shows example error projections for A, for two different model 
galaxies. The upper solid curve assumes structural parameters similar to that of Segue 
1, with a Plummer radius of 0.03 kpc and a stellar limiting radius of 0.1 kpc [35]. The 
lower dashed curve assumes structural parameters similar to that of Draco, with a King 
core radius of 0.18 kpc and a King limiting radius of 0.93 kpc [36]. Each curve assumes 
that the measurement uncertainty on each star is 2 km s _1 . In both cases, the stars 
have been uniformly distributed at projected positions in the galaxies; this provides a 
good representation of the present observational configurations. 

In addition to their interesting applications for understanding the rotation and 
proper motion of the dSphs, the calculations presented in this section are crucial for 
uncovering properties of underlying dark matter distributions. For example a strong 
gradient may reflect ongoing tidal disruption, which would clearly affect dark matter 
mass modeling, as is discussed in more detail in Section HI 
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Number of Stars 



Figure 1. The projected one-sigma error on the amplitude of the rotation parameter, 
A, as defined in Eq. 0] The upper solid curve assumes structural parameters similar to 
that of Segue 1, while the lower dashed curve assumes structural parameters similar 
to that of Draco. Each curve assumes that the measurement uncertainty on each star 
is 2 km s _1 . 

3. Proper Motions and Rotation 

This section discusses an application of the maximum likelihood formalism introduced 
in Section [21 with a specific focus on the methodology for extracting an intrinsic 
rotational signal and proper motions using an example data set. Extracting rotation 
from a data set is important for reasons discussed above, and, in addition to its 
phenomenological interests, extracting the proper motions of MW satellites may 
have important implications for understanding the origin of the accretion history of 
MW [37J EBJ El]. Specifically determining the latter would present a unique observational 
test of MW halo formation within the CDM paradigm. 

Several dSphs have kinematics data sets large enough that statistically significant 
constraints may be placed on the parameters v x , v y , and A. For illustrative purposes 
this section considers just one example, the Sculptor dSph. Sculptor is located at a 
distance of 80 kpc and has a measured King limiting radius for its stellar distribution of 
~ 1.6 kpc [1Q]. Given these parameters it is one of the more spatially extended dSphs. 
The mass content of Sculptor has been estimated in several recent papers [HI [HI [13], 
and it has been shown that Sculptor may contain some degree of rotational support [41 J. 
Further, the previous determinations of the proper motion of Sculptor from its line-of- 
sight velocities may indicate a discrepancy between the proper motion as determined 
from this method and from ground and space based measurements [12] . This latter fact 
may in itself be indicative of the presence of an intrinsic rotational component, provided 
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the systematics on the ground and space-based determinations of the Sculptor proper 
motions are well-understood |42j . 

To extract the rotation and proper motion signal, a simplified model is considered 
by assuming the likelihood function is characterized by the six parameters introduced 
in Section [2j It is assumed that the intrinsic dispersion o~i os is uniform throughout the 
galaxy, and does not depend on any of the parameters of the mass modeling introduced 
in Section @] below. Introducing the set of parameters discussed in Section 0] does not 
affect the reconstruction of the parameters discussed in this section since the intrinsic 
dispersion is uncorrelated with the parameters of the function u |32j . 



1.0 1 1 ' 1 ' 1 ' 1 1 1.0 




0.0 i , i , i , I 0.0 ! , i , i , i , i , I 

-3 -2-1 1 -3-2-1012 

[mas yr" 1 ] H 6 [mas yr 1 ] 



Figure 2. Probability distributions for the proper motions of Sculptor using its line- 
of-sight velocity data. The components fj, a and fi$ give the proper motions in the 
directions of increasing right ascension and declination, respectively. 

In order to determine the probability distributions of the parameters 6 = 
(v x ,v y ,v z ,A,(f)o,aios), a standard metropolis-hastings algorithm [29] is used to sample 
the likelihood function as written in Eq. For all runs described here 10 4 accepted 
points were obtained in each chain, with the first 10% excluded to account for a 
conservative burn-in phase. For simplicity, a uniform proposal distribution is assumed 
for each of the parameters over a wide range chosen to encompass physically-accpetable 
values for each of these parameters. The line-of-sight velocity data used is taken from the 
Walker et al. [13] sample, and only those stars with > 90% c.l. probably for membership 
are used in the analysis. 

Figure El shows the posterior probability distributions for the proper motion 
components corresponding to v x and v y , v x [kms~ ] = 4.74[.D/pc][/i Q //iasyr] 
and fy[kms _1 ] = 4.74[£)/pc][/x,j//iasyr], respectively The remaining parameters 
(v x , A, (f) , o~i os ) are marginalized over. Though the distributions in Fig. [2] were obtained 
by allowing the A parameter to float freely, the distributions are found to be relatively 
unaffected if A is instead fixed so that A = 0. This reflects the fact that the radial 
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gradient in the velocity of the stars is distinct from the intrinsic rotational component, 
which has a sinusoidal behavior as a functional of the position angle. The results 
presented in Fig. [2] are in agreement with the measurements of Walker et al. [12j, though 
here a larger set of parameters is marginalized over in this analysis. 

Figure [3] shows the corresponding probability distribution for the rotational 
parameter A. Again the remaining five parameters (v x , v y , v z , </>o, o~i os ) are marginalized 
over. The result is that, given the rotational parameterization and the using entire 
distribution of 1352 Sculptor stars, there is no statistically significant detection of 
rotation. From figure [3] the 90%. c.l. upper limit on the rotation is found to be ~ 2 
km s -1 . The result presented in figure [3] is somewhat degenerate with the parameters 
describing u; for example if v x and v y were (unphysically) set to zero, the implied upper 
limit on A reduces by about 50%. 




Figure 3. The probability for the rotational amplitude, A, using the Sculptor line-of- 
sight velocity data. 



Figure [3] represents the averaged value of A throughout the entire galaxy. It may be 
possible that the rotation amplitude in the outer region differs from the rotation rate in 
the inner region; if this were the case then it is plausible that this effect is washed out in 
the averaging process. To provide a simple test for a possible differential rotation rate, 
an additional likelihood analysis was considered with just the outer sample of Sculptor 
stars. Here the outer stars are defined as only those with projected radius beyond 0.5 
kpc. Even in this case, there is no statistically significant detection of A, though in this 
case the 90% c.l. upper limit increases to 10 km s" 1 . 
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4. Mass Distributions and Model Selection Criteria 

This section discusses the extension of the maximum likelihood analysis developed in 
Section [2j with a goal of using the kinematic data to determine dark matter mass 
distributions. A calculation of the gravitating mass of a stellar system is one of the more 
fundamental tasks in astronomy, and simple scaling arguments provide some guidance 
to anticipate the results. It is worthwhile to first review these arguments as applied to 
the dSphs before undertaking a more detailed and model dependent treatment. 

4-1. Spherical Mass Modeling 

Initial Estimates - Under the assumption that a star cluster is spherically-symmetric, 
the orbital distribution of the tracer particles are isotropic, that mass follows light, 
and the cluster is isolated from any external gravitational potential, the virial theorem 
provides a mass estimate of M vir ~ r e a\ /G, where r e is the observed extent of the cluster 
and a* is the velocity dispersion of the stars. Although this is probably the simplest 
estimate one can make for the mass of a star cluster, it does provide a useful extremum 
bound. For example Merritt [44] has show that the virial theorem may be used to derive 
a lower bound on the mass of a star cluster, which is obtained from the assumption that 
all of the mass is concentrated as a point in the center. This minimum mass is given by 
M m i n = 3cr*/ (r~ l )G, where l/(r _1 ) is the harmonic mean stellar radius in the cluster. 

Of course for dSphs it is not consistent to assume that these systems are isolated, 
since they are orbiting within the extended dark matter halo of the MW. For dSphs 
orbiting with the MW halo, the minimum mass estimate above is particularly useful, 
as it in turn provides a conservative estimate of the radius at which particles would be 
stripped due to the MW potential. As an example consider the case of Segue 1, which 
is a MW satellite with a stellar luminosity ~ 340 L at a Galactocentric distance of 
28 kpc. From the de-projected light distribution, the harmonic mean stellar radius is 
~ 10 pc, and given the velocity dispersion of 4.3 km s _1 [21], the implied minimum 
mass of Segue 1 is ~ 4 x 10 5 M . Assuming that Segue 1 is a point mass orbiting in the 
potential of the MW, the radius at which particles would be presently getting stripped 
is the Jacobi radius, r t = [M J 3Mmw]^ 3 D , where D = 28 kpc. Assuming the minimum 
mass of M — M m i n , r t ~ 300 pc. It is important to note that this provides only an 
estimate of the instantaneous tidal radius; if Segue 1 came significantly closer to the 
MW in the past then this estimate would differ. The above estimate provides a lower 
bound on the radius at which particles would be getting stripped, under the assumption 
of a circular orbit. A similar argument for the tidal radius of Segue 1 was considered 
in Geha et al. |21j using the Illingworth approximation for the mass as M m « n (For an 
alternative interpretation for the origin of Segue 1, see Ref. [4"5]). 

Jeans Equation - At the next level of detail from the dynamical perspective, an 
estimate for the mass of the dSphs may be obtained by appealing to the spherically- 
symmetric jeans equation, assuming that the gravitating mass of the system consists of 
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both stars and dark matter. The analysis here closely follows the treatment given in the 
appendix of Strigari et al. |13j, and refers to this paper for further details. A standard 
discussion of the spherical jeans equations comes from Ref. [46J. 
The spherical jeans equation is 

r^£^ = - P *(r)V?(r) - 2(3{r)p^ r . (8) 

Here p± is the de-projected stellar density profile, the circular velocity is V c (r) = GM/r, 
and the parameter /3(r) = l—a 2 /a 2 characterizes the difference between the radial and 
tangential velocity dispersions of the stars. Integrating a 2 along the line-of-sight gives 
the velocity dispersion as a function of projected radius, R, 



I(R) Jr V r2 J Vr 2 - R 2 
Here, I(R) is the projected surface density of the stellar distribution, and p± is the 
three-dimensional stellar distribution. In Eq. (Q, a r depends on the parameterization 
of the mass distribution of the dark matter component. The stellar density profile is 
taken to be fixed; for example the measurements of the projected density profiles for 
many of the classical satellites come from Ref. [17], and more updated profiles from, 
e.g., Refs. [361 SHI HH1 EQ] , while measurements of the density profiles for the ultra- faint 
satellite come from Ref. [35]. It is important to note that fixing the stellar density 
profile may introduce a degeneracy in determining the projected velocity dispersion 
profile, particularly in the central regions |51j . However the effect on the integrated 
mass distributions as considered here is less severe, motivating the assumption of fixing 
p*, rather than marginalizing over it, in the analysis. 

Given the above assumption for the velocity anisotropy of the stars and for the shape 
of the dark matter profile for the galaxy, the likelihood function can now schematically 
be written as 

C[y, a los 0, $)] = p[y, (Tios0, (10) 

For compactness, the vector y = (v x , v y , v z , A, </> ) has been defined, and f3 and $ are 
vectors that describe the stellar velocity anisotropy and the gravitational potential of 
the system, respectively. The line-of-sight velocity dispersion is dependent on (3 and 
$ through the spherical jeans equation. The mass of the system, as well as quantities 
related to the mass distribution, are determined via <3>, and thus by integrating out the 
model parameters one may determine the probability distribution for the mass of the 
system contained within a fixed physical radius. 



Error Projections on Mass Distribution - Before performing an example calculation 
using Eq. [TO], it is interesting to get an idea as to how the errors on the mass distribution 
depend on the physical radius within which the mass is determined. To perform these 
estimates, we again appeal to the Fisher matrix formalism outlined above. However the 
analysis here is different from above in that now the likelihood depends on the vector 
set of parameters (3 and $ in addition to y. 
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The example considered here uses the velocity data sample from Fornax of Walker 
et al. [13], specifically the stars with > 90% c.l. for membership. This gives a total 
of 2409 Fornax members. The three-dimensional surface density profile for Fornax is 
assumed to take the form 

P - (r)0C ^(l + xV-)/ & e '^' (U) 
with the parameters {a, b, c, [r /kpc], [r cut /kpc]} = {0.3,1.2,3.0,0.8,1.1}. A profile 
of this form with these parameters is consistent with the recent measurements of 
Fornax star counts [50] , though generally the results presented are independent of the 
normalization of the surface density profile. The stellar mass-to-light ratio is assumed 
to be unity, consistent with the results presented in Ref. [50J. The dark matter density 
profile is assumed to be the einasto profile, 

ln[p(r)/p_ 2 ] = (-2/a)[(r/r_ 2 ) a - 1], (12) 

and following CDM simulations, a = 0.17 [52]. The velocity anisotropy is assumed to 
be of the following form, 

P(r) = (Poo - (3 )r 2 /(rl + r 2 ) + ft. (13) 

Thus in the Fisher matrix calculation the base set of parameters are now given by 
= {p-2, T-2i Po, Pi, Tp} (the rotational and geometric parameters, y are ignored here: 
this is justified given that the Fornax data is consistent with A = and that the y 
parameters do not correlate with the parameters that determine the mass). 

Given the base set of parameters in 9 used to calculate F, the error on a derived 
parameter, g, is given by 




The derived parameter specifically considered here is the log of the mass within a given 
fixed physical radius (See Ref. [53] for another example where the derived parameter 
considered is the log slope of the dark matter density profile) . Where desired Gaussian 
priors may be taken by simply adding 1 / o\ a to the aa component of the Fisher matrix. 

Figure H] shows the error on the log of mass as a function of the physical radius 
within which the mass is measured. Here the fiducial baseline parameters for the 
velocity anisotropy have been taken as {Po, Pi, rp/kpc} = {—0.5,0,0.2}, implying 
slightly tangential orbits in the central region of the halo and isotropic orbits at outer 
radii. Different combinations of {p_2,T-2} have been taken as indicated to represent 
the degeneracy between these two parameters when fitting the data. Each of these 
parameters sets, combined with an anisotropy model, produces a velocity dispersion 
profile that roughly fits the profile of Fornax. While the goal here is to not undertake 
a direct fit to the data and to explore the exact degeneracy space of these parameters, 
examining these three sets of fiducial parameters gives a feel for how the constraints on 
the mass depend on the fiducial parameter set. Priors on each of r_ 2 and rg are taken 
as 1/(5 kpc) 2 , while priors on j3 and Pi are taken as l/l 2 . Each of these priors are 
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motivated by the range of these parameters scanned in the algorithm described in the 
sub-section below. As is seen, for the stellar profile considered above and the fiducial 
set of parameters taken, the best-constrained mass is at a radius ~ 0.6 — 1.0 kpc. 
This best constrained radius is found to be relatively weakly dependent on the sets of 
fiducial parameters, particularly near the best constrained mass, provided that they give 
a good fit to both the star count and velocity dispersion data of Fornax. For kinematic 
data sets that have been analyzed, the mass is a seen to be strongly-constrained at 
the approximate half-light radius, which is a general property of dispersion supported 
systems [5U [55] . 




n.oo l .... i .... i .... i .... I 

12 3 4 

Radius [kpc] 

Figure 4. Projections for the error on the log of the mass within a given physical radius 
for Fornax. The position and the errors on the stars from the Walker et al. [13] data 
sample have been used. Each curve assumes a different fiducial model for the einasto 
parameters of the dark matter halo: p_2 = 2 x 10 7 M© kpc~ 3 and r_ 2 = 1 kpc (solid 
black), p-2 = 1 x 10 7 M© kpc -3 and r_ 2 = 2 kpc (dotted red), and p_ 2 = 0.5 x 10 7 
M kpc~ 3 and r_ 2 = 5 kpc (dashed blue). 



Fornax Mass Distribution - The probability for the mass distribution of Fornax is 
now determined directly from the kinematic data, and compared to the projected error 
on the mass distribution as determined from Fig. [U As above, a metropolis hastings 
algorithm is used to determine the respective parameter distributions, and the same data 
for both the star counts and the line-of-sight velocity distribution have been used. In 
the parameter scan, uniform priors have been taken on each of the parameters over the 
following ranges as follows: {log 10 [p_ 2 /(M kpc -3 )], r_ 2 /kpc, /3 , /3 , r^} = {[6 : 10], [0 : 
10], [-5:1], [-5:1], [0:10]}. 

Figure [5] shows two example probability distributions for the Fornax mass, within 
0.6 kpc (left) and within the approximate Fornax stellar tidal radius of 3 kpc (right). 
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Figure 5. The probability distribution for the mass of Fornax within 0.6 kpc (left) 
and the mass within its stellar tidal radius (right), defined here to be 3 kpc. 

The probability distributions are seen to be slightly non-gaussian, particularly the M(3 
kpc) distribution. Comparing the approximate width of each of these distributions 
with the errors projected in Fig. H] provides generally good agreement, in spite of the 
intrinsic assumption in the Fisher matrix formalism that the errors on the parameters are 
Gaussian. Specifically for the left panel, a Gaussian fit gives log 10 [M (0.6 kpc)/M ] = 
7.47 ± 0.04. These results confirm the general trend seen in Fig. H]that the error on the 
integrated mass within a fixed physical radius increases at larger radii towards outer 
regions of the halo. 

Results of the calculations for the mass distributions of the entire population of 
dSphs are presented in Refs. [23J, [131 Ell E2]. These results, as well as more recent 
determinations, show that the central mass distributions for the dSphs are very similar, 
despite an over four order of magnitude variation in their luminosities. The average 
density within a spherical radius of ~ 0.3 kpc is ~ 0.1 M Q pc~ 3 ; for the brightest 
satellites baryons can contribute to the potential in this central region, while for the 
least luminous satellites the potential is dominated by dark matter within this region. 
Within the context of spherical models, these constant central density results are robust 
to the specific parameterization of the mass distribution, primarily due to the fact that 
the integrated mass is directly constrained via the jeans equation and the approximately 
similar scale for the velocity dispersion profiles |13j . 

Model Selection - The likelihood formalism introduced above does not give any 
information regarding the optimal parameterization of the dark matter mass profile. 
For example referring to the calculation above, is the einasto profile with just two free 
parameters an acceptable description of the data? Given the parameterization of the 
dynamics via the spherical jeans equation, we can answer this question and determine 
how many parameters are required to describe the mass profile given the maximum 
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likelihood formalism. Moreover, we can determine how the parameterization of the 
density profile depends on the given data sets. For example Segue 1, with only 24 
measured line-of-sight velocities, may require a smaller set of parameters than does 
Fornax, which has ~ 2400 measured line-of-sight velocities. 

To specifically answer the question of how to determine the appropriate set of 
parameters in maximum likelihood theory one may appeal to the bayes evidence. For 
the purposes here the evidence, E, is defined as the integral of the likelihood in Eq. [TU1 
over all of the model parameters. When comparing models, the ratio of their respective 
evidences gives an idea of how much more probable one model is over another. For 
example if 1 < Ami? < 2.5, the different between the two models is substantial; for 
2.5 < AlnE < 5, the different between the two models is strong, and for AlnE > 5 
the difference between the two models is decisive [56J. 

As an illustration, four different dSphs that span a wide range in their respective 
number of velocities are considered: Segue 1, Sextans, Sculptor, and Fornax. These 
dSphs have 24, 424, 1352, and 2409 stars respectively; for the latter three galaxies 
we consider only those stars that have a probability of > 90% membership from the 
Walker et al. [13] sample. For each dSph we determine how many model parameters are 
necessary to describe the data, and we consider several different models. 

For the "Baseline" 3 parameter model, the following range of parameter space 
is integrated over for Fornax, Sculptor, and Sextans: log lo [po/M kpc~ 3 ] = [6 : 10], 
log 10 [r /kpc] = [—1 : 1]. The velocity anisotropy is assumed to be a constant, (3, with a 
range given by (3 = /3q = fix — [—2 : 0.5]. For Segue 1 the ranges are the same except 
for the scale radius, which is taken to vary over the range log 10 [r /kpc] = [—2 : 0]; this 
range is motivated by the likely upper limit to the dark matter tidal radius for Segue 
1 [21] . All of the ranges above are chosen as plausible values to describe the halos of 
dSphs. A flat prior is chosen over these regions; as a further detail one may chose a 
prior that weights each of these parameters differently, for by example considering the 
scatter in the p s — r s relation as seen in CDM simulations [57]. The bayes evidence for 
the Baseline model will be denoted as Eq. 

Three different models are compared to this Baseline 3 parameter model: i) a model 
in which the parameter space for a is enlarged in the range [0.14 : 0.3] (corresponding 
to 1/a ~ [3 — 7]), so that central slopes that are both more flat and more steep than the 
CDM value are allowed; ii) a model with the Baseline 3-parameter p(r) profile, but with 
a three-parameter velocity anisotropy profile which depends on the three-dimensional 
physical radius as in Eq. [TBI and iii) a model in which a = [0.14 : 0.3] and the f3(r) 
profile in Eq. [13] is assumed. 

Model i) is thus described by four parameters, while model ii) is described by five 
free parameters, and model iii) is described by six free parameters. In Table [TJ, we define 
model i) as the "Exp" model, model ii) is denoted as the "/3" model, and model iii) is 
denoted as u /3+ EXP". These models provide useful illustrations of the calculation of 
the evidence as applied to dSphs; alternative models may of course be defined and even 
larger parameter spaces may be explored. The utility of the above models as defined 
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dSph 


Segue 1 


Sextans 


Sculptor 


Fornax 


# of stars 


24 


424 


1352 


2409 


Exp 


3.4 


3.3 


3.3 


3.6 


P 


1.6 


0.9 


0.8 


1.3 


(3 + Exp 


5.0 


4.3 


4.1 


4.9 



Table 1. Parameter ranges and evidences for various models. The columns are for 
different dSphs, and the rows are for different models as described in the text. Each 
entry in the table gives the ratio of the Evidence for the given model, -E mo dci, with 
respect to the Baseline 3 parameter model, E , defined in the text, ln(i? moc jei/-Eo)- 

allows us to explore to what extent CDM-like inner slopes are more favored, and to 
what extent an alternative parameterization of the velocity anisotropy provides a better 
fit to the data as compared to simply changing the value of the inner slope. 

The results for the ratio of the bayes evidence for the various models, relative to 
E , are shown in Table [H For each of the galaxies, we see roughly the same pattern; as 
more parameters are added, the better that the model fits the data. This result implies 
that models with larger sets of parameters are favored even after penalization for the 
larger volume of parameter space that is integrated over. Allowing for a larger volume 
of parameter space for the dark matter density profile affects the evidence more than 
simply varying the shape of the anisotropy profile. In total, the best fitting models are 
those that allow both the velocity anisotropy and the central slope to vary freely, i.e. 
the 0+ EXP models. 

The results in Tabled] indicate that for all four galaxies variable velocity anisotropics 
are slightly preferred relative to those with constant velocity anisotropy, and that central 
dark matter profiles both less cuspy and more cuspy than ACDM based fits are equally 
acceptable. Future data sets, both line-of-sight velocities and potential proper motion 
measurements for stars in dSphs [58l 1531 [59] , will be important in narrowing the 
acceptable ranges for both the velocity anisotropy and the central slope. 

5. Conclusion 

This article has discussed the analysis of kinematic data from Milky Way dwarf 
spheroidals, with a primary motivation of 1) understanding physical quantities that 
are well-constrained by the data and 2) understanding the systematics that underly the 
determination of the dark matter masses of these systems, given the simplest assumption 
that the dSphs are purely pressure supported systems. Of the possible systematics 
perhaps the most significant and observationally- accessible is the determination of a 
velocity gradient in the data sample, which may be indicative of tidal disruption from 
the potential well of the Milky Way. The results in the literature indicate that, based on 
the kinematic data alone, velocity gradients due to tidal disruption or rotation are not 
conclusively present in any of the dSphs. This article has provided an example, using a 
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simple parameterization, of how to search for rotation in the kinematic data sets using 
a maximum likelihood analysis. The kinematic sample of Sculptor was analyzed, and 
it was found that the maximum likelihood rotational amplitude is zero, with an upper 
limit of ~ 2 km s _1 at 90% c.l. The magnitude of these errors are consistent with the 
projected magnitude of the errors from theoretical modeling. 

When modeling the mass distribution of the dark matter halos of the dSphs, 
degeneracies between model parameters affect the determination of the total mass 
profiles, even in the context of the simplest spherical models. To shed light on these 
degeneracies, this article has discussed a new criteria for model selection applied to the 
dSph kinematic data sets, taking a step towards determining how many parameters 
are needed to describe the mass distribution of spherical halos. For the four dSphs 
studied here, chosen because they have a wide range of available line-of-sight velocities, 
it is shown that, assuming CDM- motivated Einasto profiles for the dark matter halos, 
models with variable velocity anisotropy are slightly preferred relative to those with 
constant velocity anisotropy. Further, central slopes for the dark matter profile that 
are found in CDM simulations are not a unique description of the data sets; both more 
cuspy and less cuspy models are allowed for the central slope. This is primarily due 
to the degeneracy between the central dark matter slope with the central stellar profile 
and the velocity anisotropy distribution |53j. 

Future photometric and kinematic data sets promise to further pin down the mass 
distributions of the dSph dark matter halos. Upcoming data for the ultra-faint satellites 
will be particularly important, and may be able to show whether any tidal effects are 
present in these galaxies. Further, development of non-spherical distributions for both 
the light and dark matter should be considered given these data sets (for initial results 
along these lines see Ref [60]). Controlling systematics in these data sets will prove to be 
important step towards further testing the currently favored ACDM theory of structure 
formation. 
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